Fix CLI token source --profile fallback with version detection#1605
Fix CLI token source --profile fallback with version detection#1605mihaimitrea-db wants to merge 1 commit intomainfrom
Conversation
4cb4c26 to
82d9599
Compare
Range-diff: stack/force-refresh-flag (4cb4c26 -> 82d9599)
Reproduce locally: |
82d9599 to
68d45f4
Compare
Range-diff: stack/force-refresh-flag (82d9599 -> 68d45f4)
Reproduce locally: |
68d45f4 to
4a5079f
Compare
Range-diff: stack/force-refresh-flag (68d45f4 -> 4a5079f)
Reproduce locally: |
4a5079f to
6f4fead
Compare
Range-diff: stack/force-refresh-flag (4a5079f -> 6f4fead)
Reproduce locally: |
6f4fead to
2218270
Compare
Range-diff: stack/force-refresh-flag (6f4fead -> 2218270)
Reproduce locally: |
2218270 to
76e74ca
Compare
Range-diff: stack/force-refresh-flag (2218270 -> 76e74ca)
Reproduce locally: |
| var exitErr *exec.ExitError | ||
| if errors.As(err, &exitErr) { | ||
| return nil, fmt.Errorf("cannot get access token: %s", strings.TrimSpace(string(exitErr.Stderr))) | ||
| return nil, fmt.Errorf("cannot get access token: %q", strings.TrimSpace(string(exitErr.Stderr))) |
There was a problem hiding this comment.
Sorry, what I meant is that we are losing the exec.ExitError type. The new type that we have introduced does not solve that.
| // Token fetches an OAuth token by shelling out to the Databricks CLI. | ||
| // If the working command has already been identified, it is called directly. | ||
| // Otherwise, [probeAndExec] tries each command in order to find one that works. | ||
| func (c *CliTokenSource) Token(ctx context.Context) (*oauth2.Token, error) { |
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
| return c.execCliCommand(ctx, c.hostCmd) | ||
| // probeAndExec walks the command list from most-featured to simplest, looking | ||
| // for a CLI command that succeeds. When a command fails with "unknown flag" for | ||
| // one of its [cliCommand.usedFlags], it logs a warning and tries the next. |
There was a problem hiding this comment.
| // one of its [cliCommand.usedFlags], it logs a warning and tries the next. | |
| // one of its [cliCommand.usedFlags], it logs a warning and tries the next. The "counter" is not incremented if the command fails for any other reason. |
| var exitErr *exec.ExitError | ||
| if errors.As(err, &exitErr) { | ||
| return nil, fmt.Errorf("cannot get access token: %s", strings.TrimSpace(string(exitErr.Stderr))) | ||
| return nil, fmt.Errorf("cannot get access token: %w", |
There was a problem hiding this comment.
From offline discussion: we are loosing the exec.ExitError type. I don't have an opinion on whether we should keep it or not. Though, if we don't let's be explicit about the rationale.
79ac794 to
a19dab0
Compare
Range-diff: stack/force-refresh-flag (79ac794 -> a19dab0)
Reproduce locally: |
|
Update: Reopening this PR with a new approach. The try-and-retry fallback mechanism from the previous iteration does not work for `--profile` because it is a global Cobra flag — old CLIs (< v0.207.1) silently accept it instead of reporting `"unknown flag: --profile"`, failing later with `"cannot fetch credentials"`. This made the `--host` fallback dead code. The new approach uses version detection: run `databricks version` at init time, parse the semver, and use it to decide which flags the CLI supports. This also enables passing `--force-refresh` (databricks/cli#4767) when the CLI version supports it. See the updated PR description for details. |
a19dab0 to
6c77eda
Compare
d7431c4 to
0572f6e
Compare
0572f6e to
4280bbe
Compare
The --profile flag is a global Cobra flag in the Databricks CLI. Old CLIs (< v0.207.1) silently accept it on `auth token` but fail with "cannot fetch credentials" instead of "unknown flag: --profile". This made the previous error-based fallback to --host dead code. Replace the try-and-retry approach with version-based detection: run `databricks version` at init time and use the parsed semver to decide between --profile and --host. This also simplifies CliTokenSource to a single resolved command with no runtime probing. Signed-off-by: Mihai Mitrea <mihai.mitrea@databricks.com>
4280bbe to
271785e
Compare
|
If integration tests don't run automatically, an authorized user can run them manually by following the instructions below: Trigger: Inputs:
Checks will be approved automatically on success. |
Summary
Fix the broken
--profilefallback inCliTokenSourceby replacing error-based detection with version-based CLI detection at init time.Why
The
--profileflag ondatabricks auth tokenis a global Cobra flag (defined as a persistent flag on the root command). Old CLIs (< v0.207.1) silently accept it — they don't report"unknown flag: --profile"but instead fail later with"cannot fetch credentials". This means the existingisUnknownFlagErrorcheck (config/cli_token_source.go:120) never matches, and the--hostfallback is dead code.This was verified by testing against CLI v0.207.0 vs v0.207.1:
databricks auth token --profile workspace→Error: init: cannot fetch credentials(not "unknown flag")databricks auth token --profile workspace→ returns a valid tokenApproaches considered
Three approaches were evaluated for detecting whether the installed CLI supports
--profile:Error-based detection (try-and-retry) — the current approach on
main. Rundatabricks auth token --profile <name>and check whether the error contains"unknown flag: --profile". This is broken: because--profileis a global Cobra flag, old CLIs accept it silently and fail with a different error ("cannot fetch credentials"), so the fallback to--hostnever triggers.--helpflag parsing (databricks auth token --help+ substring matching) was rejected because the--helpoutput format is not a stable API. More importantly,--profilewould appear in--helpoutput even on old CLIs that don't actually implement profile-based token lookup — it shows up because it's a global persistent flag, not because theauth tokensubcommand uses it. This approach has the same fundamental flaw as error-based detection.Version detection (
databricks version+ semver comparison) — the approach taken here. Rundatabricks versionat init time, parse the semver (e.g.,"Databricks CLI v0.207.1"), and compare against known minimum versions for each flag. This is reliable because the version string is a stable output format, and the mapping between flags and CLI versions is well-defined (databricks/cli#855 for--profilein v0.207.1). If version detection fails, the SDK falls back to the most conservative command (--hostonly).References
--profilesupport added in CLI v0.207.1: databricks/cli#855 (Oct 2023)What changed
Interface changes
None.
CliTokenSourceis not part of the public API surface.NewCliTokenSourcenow takescontext.Contextas its first parameter, needed forexec.CommandContextwhen runningdatabricks version. This is consistent with everyCredentialsStrategy.Configuremethod in the codebase, and the single caller (auth_u2m.go) already hasctxin scope.Behavioral changes
cfg.Profileis set but the CLI is too old (< v0.207.1), the SDK now correctly falls back to--host. Previously this fallback was dead code.--profileflag is not supported and the SDK falls back to--host.Internal changes
cliVersiontype: semver parsing withAtLeast()comparison andString()formatting.getCliVersion(ctx, cliPath): runsdatabricks versionand parses the output.parseCliVersion: parses"Databricks CLI v0.207.1"→cliVersion{0, 207, 1}.resolveCliCommand: bridges version detection and command building. Falls back to zero version on detection failure.buildCliCommand: pure function — takes a version, returns a single resolved command. No exec calls, easy to test.CliTokenSourcesimplified to a singlecmd []stringfield. NohostCmd, no runtime fallback.Token()is now one line:return c.execCliCommand(ctx, c.cmd).isUnknownFlagError,buildCliCommands(plural),buildHostCommand,hostCmdfield.How is this tested?
Manual tests on versions 0.207.0 and 0.207.1
Unit tests in
config/cli_token_source_test.go:TestParseCliVersion— standard versions, patch versions, malformed output, empty string, missing prefix.TestCliVersion_AtLeast— equal, higher/lower patch/minor/major, zero vs zero, zero vs nonzero.TestBuildCliCommand— table-driven: version x config → expected command. Covers: host-only, account host, profile+new CLI (uses --profile), profile+old CLI (falls back to --host), profile-only+old CLI (nil), zero version (detection failed, falls back to --host), neither profile nor host (nil).TestNewCliTokenSource— success with host, success with profile, CLI not found, neither profile nor host.TestCliTokenSource_Token— success, CLI error, invalid JSON.